Skip to content

Conversation

@awln-temporal
Copy link
Contributor

@awln-temporal awln-temporal commented Nov 18, 2025

What changed?

Add implementation of CHASM List/Count Runs. Given a CHASM archetype and supported MemoType, a Visibility Query can be ran against both custom and CHASM search attributes. Both operations wrap around the Visibility Manager methods to List/Count Workflows.

In the initial handler of the query in visibility_store.go, the query converter is passed the chasm VisibilitySearchAttributesMapper and the archetypeID. When resolving query parameters, the order of resolving aliases to field names is Custom -> CHASM -> System/Predefined attribute.

Removes chasm transient dependency on persistence package

Fixes visibility task executor to emit Warning Log instead of returning error when custom search attribute is not found. (Edge case when search attributes get deregistered between task creation and execution)

Why?

Required to support CHASM Visibility and Component authors.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

@awln-temporal awln-temporal force-pushed the CHASMVisibilityQuery branch 4 times, most recently from 1b7b1d1 to db42bca Compare November 20, 2025 18:52
@awln-temporal awln-temporal force-pushed the CHASMVisibilityQuery branch 3 times, most recently from c71cd57 to 6d0b6fa Compare November 20, 2025 19:10
Comment on lines 16 to 17
UserMemoPrefix = "__user__"
ChasmMemoPrefix = "__chasm__"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
UserMemoPrefix = "__user__"
ChasmMemoPrefix = "__chasm__"
UserMemoKey = "__user__"
ChasmMemoKey = "__chasm__"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change the var names.

@awln-temporal awln-temporal force-pushed the CHASMVisibilityQuery branch 7 times, most recently from 74fae89 to 23c1374 Compare November 21, 2025 21:58
Copy link
Member

@bergundy bergundy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will let others review the implementation but the API LGTM.

Just please document all of the exported functions. CHASM will be used by a lot of developers and we want to help people out as much as possible.

Comment on lines +87 to +89
NamespaceID string
NamespaceName string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we need to take in both?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NamespaceID is what we store in persistence, and NamespaceName is needed for dynamic config and custom search attributes mapper.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting that matching needs visibility manager as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scheduler has some constraints in their query, so it needs the visibility manager to build the query converter IIRC.

return fqn, ok
}

// ComponentByID returns the registrable component for a given archetype ID.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz also comment that those methods should not be used by chasm library developer. and also for RegistrableComponent.SearchAttributesMapper()

we probably need to create a separate interface for developer facing registry later.

if ok {
newMemo := memoProvider.Memo(immutableContext)
if !maps.EqualFunc(n.currentMemo, newMemo, isVisibilityValueEqual) {
if !proto.Equal(n.currentMemo, newMemo) {
Copy link
Member

@yycptt yycptt Nov 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I think now that we are using proto.Message we no longer have guarantee that there's no error upon encoding in task executor, but guess that should never happen.

Something for later, since the currentMemo/SA here are always consistent with CHASM tree state, in visibility task executor we actually don't have to invoke SA/Memo provider again, we can just serialize & buffer the value here (and now we can also return error upon closeTransaction if it can't be encoded), and use it in the vis task executor.

Count int64
}

ChasmExecutionInfo struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that persistence can depend on chasm package, for those new struct definitions, is it possible to reuse the corresponding definitions in chasm package and thus avoiding the conversion between the different struct definition of essentially the same thing in chasm_engine.go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can definitely reuse the return struct between the vis manager and ListExecutions in chasm_engine.go, i'm currently keeping separate request types since there is a difference when calling vis manager (archetypeType in engine.go vs archetypeID in vis manager)

InitialFailoverVersion: resp.GetInitialFailoverVersion(),
IsGlobalNamespaceEnabled: resp.GetIsGlobalNamespaceEnabled(),
IsConnectionEnabled: request.GetEnableRemoteClusterConnection(),
IsReplicationEnabled: request.GetIsReplicationEnabled(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, thanks for catching this

var userMemo *commonpb.Memo
var chasmMemoPayload *commonpb.Payload

// Check if archetype matches to decide how to split memo
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm why not just check if __chasm__ is a key in the combinedMemo?

if we are worrying about potential conflict (though I feel that chance is very low...), I'd probably just check if TemporalNamespaceDivision string is a number or not (just the first char should be enough), since we are going to use this method to query across archetypes, e.g. in namespace migration case, I just want all visibility records.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm concerned about potential conflict (although very low).
I'm ok with checking if TemporalNamespaceDivision is a number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to just checking if number. We can check the first char if we think that optimization is needed, just thought checking the entire string might be more clear.

Comment on lines 16 to 17
UserMemoPrefix = "__user__"
ChasmMemoPrefix = "__chasm__"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change the var names.

}

// GetBool returns the boolean value for a given SearchAttributeBool. If not found or map is nil, second parameter is false.
func (m SearchAttributesMap) GetBool(sa SearchAttributeBool) (bool, bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can probably use generics here and have just a single Get method.
It would require defining another TypedSearchAttribute[T] interface with a single typeMarker(T) method and making sure that SearchAttributeBool and friend implement that interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, i like this approach, making a change for this

}

alias := sa.definition().alias
boolValue, ok := m.values[alias].(VisibilityValueBool)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should document this behavior. I'm also on the fence on whether we should error out here or panic or just return false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current behavior silently returns not found if any type checks fail. Added documentation for the runtime check before casting, returning a zero val and false if unable to cast successfully.

For the first cast/check to a VisibilityValue, the case this fails is when a Developer doesn't register search attributes and has duplicate aliases.

Second cast from VisibilityValue to underlying value would fail if search attribute author (me) did incorrect mappings between VisibilityValue and its type.

Unit tests are able to catch cases for the Second cast. Ideally, functional tests should be able to cover the first case? I'm not sure if there's any benefit to panicking, but I can see if we threw errors, we could also get different handling of cases, etc.

Let me know your thoughts.


typedSearchAttribute[T any] interface {
SearchAttribute
typeMarker(T)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes the mapping function more concise? Otherwise we need to switch on each visibility type to its underlying primitive right

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also makes it possible to specify the correct type in the method instead of needing to return any

Comment on lines 465 to 476
if reflect.TypeOf(visibilityValue) != sa.markedVisibilityType() {
return zero, false
}

reflectVal := reflect.ValueOf(visibilityValue)
targetType := reflect.TypeFor[T]()

if !reflectVal.Type().ConvertibleTo(targetType) {
return zero, false
}

result, ok := reflectVal.Convert(targetType).Interface().(T)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think lines L465-474 are unnecessary. You could simply do result, ok := visibilityValue.(T). If ok is true, then it good, otherwise, it's not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea good point lol

Comment on lines 101 to 111
if sadefs.IsChasmSearchAttribute(fieldName) {
fieldType, err = vi.chasmMapper.ValueType(fieldName)
if err != nil {
return nil, query.NewConverterError("invalid search attribute: %s", name)
}
} else {
fieldType, err = vi.searchAttributesTypeMap.GetType(fieldName)
if err != nil {
return nil, query.NewConverterError("invalid search attribute: %s", name)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if sadefs.IsChasmSearchAttribute(fieldName) {
fieldType, err = vi.chasmMapper.ValueType(fieldName)
if err != nil {
return nil, query.NewConverterError("invalid search attribute: %s", name)
}
} else {
fieldType, err = vi.searchAttributesTypeMap.GetType(fieldName)
if err != nil {
return nil, query.NewConverterError("invalid search attribute: %s", name)
}
}
if sadefs.IsChasmSearchAttribute(fieldName) {
fieldType, err = vi.chasmMapper.ValueType(fieldName)
} else {
fieldType, err = vi.searchAttributesTypeMap.GetType(fieldName)
}
if err != nil {
return nil, query.NewConverterError("invalid search attribute: %s", name)
}

Copy link
Contributor Author

@awln-temporal awln-temporal Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

original suggestion was supposed to combine type maps, but got overridden while going through merge conflicts, changed to

combinedTypeMap := make(map[string]enumspb.IndexedValueType)
maps.Copy(combinedTypeMap, vi.searchAttributesTypeMap.Custom())
maps.Copy(combinedTypeMap, vi.chasmMapper.SATypeMap())
finalTypeMap := searchattribute.NewNameTypeMap(combinedTypeMap)

fyi keeping the combined type map initialization on separate line since there were potential panics if either custom or chasm search attribute type map is nil.

Comment on lines +412 to +413
t.logger.Warn("Failed to get field name for alias, ignoring search attribute", tag.NewStringTag("alias", alias), tag.Error(err))
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should still return an error. Why are you changing it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original thread: #8558 (comment).
The original context was about deregistering search attributes while the task is getting processed, which would lead to task failure.

After looking at this a bit more, any upsert SA task execution would still fail when generating the ES Doc and attempting to decode the value, if the search attribute has been deregistered.

In the SQL implementation though, since these fields are constant and always attached to the NameTypeMap/ClusterMetadata, Upserting these fields would succeed, so there is a discrepancy there.

I think we should not be throwing an error here though. Assuming we want parity with Upserting search attributes for workflows, once we have parity between ES and SQL for preallocated columns, we should not throw an error if a search attribute gets deregistered when executing the CHASM vis task, and just do a best effort upsert call. Otherwise, this task would get retried and sent to DLQ if I understand correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants